Refining the Research Focus

  • Initial Research Direction: Explore proposed penalization by Zilberman & Abramovich (2025).

  • Problem: The proposed method relies on Lasso and SLOPE as convex surrogates.

  • Refined Project Focus: This project centers specifically on the performance of SLOPE (Sorted L-One Penalized Estimation) for count data.

Introduction

SLOPE (Sorted L-One Penalized Estimation) is a method for estimating the parameter \(\beta\) in a parametric statistical model.

  • Like LASSO, this method adds a penalty term based on the \(\ell_{1}\) norm of the estimate \(\hat{\beta}\) to the model fit.
  • Unlike LASSO, SLOPE does not apply a single constant \(\lambda\) to the penalty; it assigns a decreasing sequence of weights to the coefficients sorted by magnitude.

Comparison of \(\ell_{1}\) Penalties

  • The penalty term in LASSO regression is \(\lambda\sum_{i=1}^{p}\left\vert\hat{\beta}_{i}\right\vert\).
  • The SLOPE penalty is given by \(\sum_{i=1}^{p}\lambda_{i}\left\vert\hat{\beta}_{(i)}\right\vert\).
    • In this equation, \(\lambda_{1} \ge \lambda_{2} \ge \dots \ge \lambda_{p} \ge 0\), and the elements of \(\hat{\beta}\) are sorted so that \(\left\vert\hat{\beta}_{(1)}\right\vert \ge \dots \ge \left\vert\hat{\beta}_{(p)}\right\vert\).
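
The two penalties above can be sketched side by side; a minimal illustration in numpy (function names are my own):

```python
import numpy as np

def slope_penalty(beta, lambdas):
    """Sorted L-One penalty: sum_i lambda_i * |beta|_(i), where the
    |beta|_(i) are sorted decreasing and lambda_1 >= ... >= lambda_p >= 0."""
    abs_sorted = np.sort(np.abs(beta))[::-1]  # |beta| in decreasing order
    return float(np.dot(lambdas, abs_sorted))

def lasso_penalty(beta, lam):
    """Standard LASSO penalty: lam * ||beta||_1."""
    return lam * float(np.sum(np.abs(beta)))
```

Note that the largest weight \(\lambda_1\) is matched with the largest coefficient, so large coefficients are penalized more heavily than under LASSO with the same average weight.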

Background - Multiple Hypothesis Testing and FWER

  • Example - gene testing with \(n\) patients and \(m > n\) predictors
  • Original setting - linear regression with known variance
  • FWER - family-wise error rate
    • Probability of making at least one type I error
  • Bonferroni correction
    • Set \(\alpha_{\text{BON}} = \frac{\alpha}{m}\)

Background - FDR

  • FDR - false discovery rate
    • Expected proportion of false rejections
  • Benjamini-Hochberg
    • Order the p-values
    • Find the largest \(j\) such that \(p_{(j)} \leq \frac{qj}{m}\), and reject the hypotheses with the \(j\) smallest p-values
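
The two corrections can be sketched in a few lines of plain Python (function names are my own):

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H_j when p_j <= alpha / m (controls FWER)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg(pvals, q=0.05):
    """Step-up BH procedure: reject the hypotheses with the j* smallest
    p-values, where j* is the largest j with p_(j) <= q*j/m (controls FDR)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    j_star = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            j_star = rank
    reject = [False] * m
    for i in order[:j_star]:
        reject[i] = True
    return reject
```

For p-values \((0.001, 0.02, 0.03, 0.8)\) with \(\alpha = q = 0.05\), Bonferroni rejects only the first, while BH rejects the first three.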

From Hypothesis Testing to Inference - LASSO

  • Model selection can be viewed as multiple hypothesis testing
    • A coefficient estimated as exactly 0 corresponds to declaring that predictor not significant
  • For orthogonal design matrices, LASSO with a suitably chosen (Bonferroni-level) \(\lambda\) behaves like Bonferroni testing
    • Orthogonal design matrix: the columns of \(X\) are orthogonal
    • LASSO: \(\min \frac 1 2 \|y-X\beta\|_2^2 + \lambda \|\beta\|_1\)
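
For an orthonormal design (\(X^{\top}X = I\)), the LASSO objective separates coordinate-wise and its solution is soft-thresholding of the OLS coefficients \(z = X^{\top}y\); a minimal numpy sketch:

```python
import numpy as np

def soft_threshold(z, lam):
    """Closed-form LASSO solution for an orthonormal design (X^T X = I):
    each OLS coefficient z_j = (X^T y)_j is shrunk toward zero by lam,
    and coefficients with |z_j| <= lam are set exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)
```

Selecting the coefficients that survive the threshold is then the same as rejecting the hypotheses whose test statistics exceed a fixed cutoff, which is where the Bonferroni analogy comes from.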

From Hypothesis Testing to Inference - SLOPE

  • What if we use BH like penalties instead?
    • \(\lambda_{\text{BH}}(i) := \Phi^{-1}\left(1-\frac{qi}{2m}\right)\), where \(\Phi^{-1}\) is the standard normal quantile function
    • \(\min \frac 1 2 \|y-X\beta\|_2^2 + \sigma \cdot \sum_{i=1}^m \lambda_{\text{BH}}(i)\left\vert\beta\right\vert_{(i)}\)
  • Key differences
    • Non-homogeneous penalty (a different \(\lambda_i\) at each rank)
    • Sorting of coefficients
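
The BH-inspired weight sequence above can be computed with the standard library's normal quantile function; a minimal sketch (the function name is my own):

```python
from statistics import NormalDist

def lambda_bh(m, q=0.1):
    """BH-inspired SLOPE weights: lambda_i = Phi^{-1}(1 - q*i/(2m)),
    a strictly decreasing sequence of standard normal quantiles."""
    phi_inv = NormalDist().inv_cdf
    return [phi_inv(1 - q * i / (2 * m)) for i in range(1, m + 1)]
```

Because \(1 - \frac{qi}{2m}\) decreases in \(i\), the resulting weights automatically satisfy the monotonicity requirement \(\lambda_1 \ge \dots \ge \lambda_m \ge 0\) (for \(q < 1\)).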

SLOPE on orthogonal design matrices

  • Provably controls FDR for orthogonal design matrices with known Gaussian errors
  • Convex optimization problem
  • Not exactly equivalent to Benjamini-Hochberg

SLOPE in general

  • The weights \(\lambda_i\) don’t have to come from the BH-inspired sequence
  • They only need to satisfy \(\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0\)
  • Suggested use
    • Use SLOPE for model selection
    • Once a model is selected, find coefficients with OLS
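
Assuming a support has already been selected (by SLOPE or any sparse method; no solver is shown here), the OLS refit step might look like:

```python
import numpy as np

def refit_ols(X, y, support):
    """Debiasing step: after a sparse method selects `support` (indices of
    nonzero coefficients), re-estimate those coefficients by ordinary least
    squares on the selected columns only; all other entries stay at zero."""
    beta = np.zeros(X.shape[1])
    if len(support) > 0:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta[np.asarray(support)] = coef
    return beta
```

This removes the shrinkage bias that the penalty introduces on the selected coefficients.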

Experiments


Experimental Design: Proposal

  • Research Gap: The original SLOPE paper (Bogdan et al., 2015) focused on linear models with Gaussian errors.
  • Goal: Compare SLOPE against other penalized methods (Lasso, Adaptive Lasso) via simulations specifically for count responses (Poisson model).
  • Research Question: How does the performance of SLOPE regarding variable selection accuracy (FDR and Power) compare to Lasso and Adaptive Lasso when applied to high-dimensional Poisson regression?

Compared Methods & Penalties

  • Objective: Minimize \(- \frac{1}{n} \log L(\beta; y, X) + \text{Penalty}(\beta)\)
  • Penalties:
    • SLOPE Penalty: \(\sum_{i=1}^{p}\lambda_{i}|\beta|_{(i)}\), \(|\beta|_{(1)} \ge ... \ge |\beta|_{(p)}\), \(\lambda_1 \ge ... \ge \lambda_p \ge 0\).
    • Lasso (L1): \(\lambda ||\beta||_{1}\)
    • Adaptive Lasso: \(\lambda \sum_{j=1}^{p} w_j |\beta_j|\) (where \(w_j \propto 1/|\hat{\beta}_{init, j}|^\gamma\))

Experimental Design: Data Generation

  • Model: Poisson Regression
    • \(Y_i \sim \text{Poisson}(\lambda_i)\)
    • \(\log(\lambda_i) = \beta_0 + x_i^{\top}\beta\)
  • Dimensions: \(n = 1000\) observations.
    • \(p = 500\) (p < n)
    • \(p = 1000\) (p = n)
    • \(p = 2000\) (p > n)
  • Replications: R = 50 runs per setting.
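
The response-generation step can be sketched as follows (a minimal illustration assuming numpy; the intercept default and seed handling are my own choices):

```python
import numpy as np

def simulate_poisson(X, beta, beta0=0.0, rng=None):
    """Draw Y_i ~ Poisson(lambda_i) with log(lambda_i) = beta0 + x_i^T beta."""
    rng = np.random.default_rng(rng)
    lam = np.exp(beta0 + X @ beta)
    return rng.poisson(lam)
```

Each of the \(R = 50\) replications would redraw both the noise in \(Y\) and (if desired) the design \(X\).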

Experimental Design: Predictors (X)

  • Generation: \(X_{ij} \sim N(0, 1)\), columns standardized.
  • Correlation Structures:
    • Independent: \(\rho = 0\)
    • Moderate: \(\rho = 0.5\)
    • High: \(\rho = 0.8\)
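
One common way to realize a constant pairwise correlation \(\rho\) is an equicorrelation construction; the slides do not say whether equicorrelation or, e.g., an AR(1) structure is intended, so this is one possible sketch:

```python
import numpy as np

def correlated_X(n, p, rho, rng=None):
    """Equicorrelated Gaussian predictors: corr(X_j, X_k) = rho for j != k,
    via X_j = sqrt(rho)*Z0 + sqrt(1-rho)*Z_j; columns are then standardized."""
    rng = np.random.default_rng(rng)
    z0 = rng.normal(size=(n, 1))      # shared factor inducing correlation
    Z = rng.normal(size=(n, p))       # independent column-specific noise
    X = np.sqrt(rho) * z0 + np.sqrt(1.0 - rho) * Z
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns
    return X
```

Setting \(\rho = 0\) recovers the independent case \(X_{ij} \sim N(0, 1)\).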

Experimental Design: True \(\beta\)

  • Sparsity \(k = ||\beta||_0\):
    • \(k = 10\)
    • \(k = 20\)
    • \(k = 50\)
    • \(k = 100\)
  • Signal Strength:
    • Simulate “Weak” \(\beta\) scenarios.
    • Simulate “Strong” \(\beta\) scenarios.
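
A sketch of generating the true \(\beta\) with \(k\) nonzero entries; the amplitude passed in stands for the unspecified "weak"/"strong" levels, and random positions/signs are my own illustrative choice:

```python
import numpy as np

def sparse_beta(p, k, amplitude, rng=None):
    """True coefficient vector with k nonzero entries of magnitude `amplitude`
    at random positions with random signs; the remaining p - k entries are 0."""
    rng = np.random.default_rng(rng)
    beta = np.zeros(p)
    idx = rng.choice(p, size=k, replace=False)
    beta[idx] = amplitude * rng.choice([-1.0, 1.0], size=k)
    return beta
```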

Experimental Design: Parameter Tuning

  • Lasso, Adaptive Lasso:
    • 10-fold Cross-Validation.
    • Select tuning parameter(s) by minimizing Poisson deviance.
    • (Specify initial estimator and \(\gamma\) for Adaptive Lasso)
  • SLOPE:
    • Target FDR level \(q = 0.1\).
    • Use BH-inspired sequence \(\{\lambda_i\}\).

Experimental Design: Evaluation Metrics

  • False Discovery Rate.
  • Power.
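
Both metrics can be computed per replication from the estimated and true supports; FDR is then the average false discovery proportion across the \(R\) runs. A minimal sketch (function name is my own):

```python
def fdr_and_power(selected, true_support):
    """Per-replication metrics from index sets of estimated / true nonzeros:
    FDP = false discoveries / max(#discoveries, 1);
    power = true discoveries / #true signals."""
    sel = set(selected)
    true = set(true_support)
    false_discoveries = len(sel - true)
    fdp = false_discoveries / max(len(sel), 1)
    power = len(sel & true) / max(len(true), 1)
    return fdp, power
```

The `max(..., 1)` guard makes the FDP equal to 0 when nothing is selected, matching the usual convention in the FDR literature.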